feat(opentelemetry-collector): add VPA and in-place resize support#2076

Open
boqu wants to merge 9 commits into open-telemetry:main from boqu:add-support-for-vpa

@boqu
Contributor

@boqu boqu commented Feb 14, 2026

Summary

  • Add VerticalPodAutoscaler (VPA) support to the opentelemetry-collector chart, following the pattern already established in the opentelemetry-operator chart
  • Add container resizePolicy support for Kubernetes in-place pod resource resize (>= 1.27)

VPA

  • New templates/vpa.yaml template gated on autoscaling.k8s.io/v1 API capability and verticalPodAutoscaler.enabled
  • Supports all three modes: daemonset, deployment, statefulset
  • Configurable: recommenders, controlledResources, maxAllowed, minAllowed, updatePolicy, and full containerPolicies override
  • New vpaKind helper in _helpers.tpl
  • Values schema validation in values.schema.json
  • CI test file ci/vpa-deployment-values.yaml
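A minimal sketch of a values override that would exercise the options listed above. The key names come from the summary; their exact nesting under verticalPodAutoscaler and the resource amounts are assumptions, not taken from the chart.

```yaml
# Hypothetical values override; key nesting and amounts are assumed for illustration.
mode: deployment            # one of: daemonset, deployment, statefulset
verticalPodAutoscaler:
  enabled: true             # the template is also gated on the autoscaling.k8s.io/v1 API
  controlledResources:
    - cpu
    - memory
  minAllowed:
    cpu: 100m
    memory: 128Mi
  maxAllowed:
    cpu: "1"
    memory: 1Gi
  updatePolicy:
    updateMode: Auto
```

The VPA CRDs and controller must already be installed in the cluster for this to render into an object the API server accepts.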

In-place resize

  • New resizePolicy field on the collector container in _pod.tpl
  • Supports NotRequired and RestartContainer restart policies per resource
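For reference, this is roughly what a container-level resizePolicy looks like in a rendered Pod spec. The field names follow the Kubernetes Pod API; the container name and the choice of policy per resource are assumptions for illustration.

```yaml
# Fragment of a Pod spec (Kubernetes >= 1.27); container name is hypothetical.
containers:
  - name: opentelemetry-collector
    resizePolicy:
      - resourceName: cpu
        restartPolicy: NotRequired       # resize CPU without restarting the container
      - resourceName: memory
        restartPolicy: RestartContainer  # restart the container when memory changes
```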

@boqu boqu requested review from a team, TylerHelmuth, dmitryax and povilasv as code owners February 14, 2026 08:48
Comment thread charts/opentelemetry-collector/values.yaml Outdated
Comment thread charts/opentelemetry-collector/values.yaml Outdated
Comment thread charts/opentelemetry-collector/templates/vpa.yaml Outdated
@boqu
Contributor Author

boqu commented Feb 17, 2026

@TylerHelmuth Thank you for your review. I've pushed changes addressing all of your comments. Please let me know if anything else is needed.

@boqu
Contributor Author

boqu commented Feb 17, 2026

CI failed because the VPA CRD is missing. See https://github.com/open-telemetry/opentelemetry-helm-charts/actions/runs/22117983560/job/63930896483?pr=2076.

VPA CRDs are not part of core Kubernetes. Is it OK to add the VPA CRDs to the CI test cluster?

@dmitryax
Member

I would be strongly against bringing non-standard CRDs to this chart. Is there currently a way to install VPA separately?

@TylerHelmuth
Member

@boqu does .Values.extraManifest work here?

@boqu
Contributor Author

boqu commented Feb 18, 2026

> @boqu does .Values.extraManifest work here?

I added minimal stub VPA CRDs here, but the install fails with the error below. If my understanding is correct, Helm resolves all resource types against the cluster API server before applying anything, and it fails at that step. So it looks like we still need to install the CRDs before the CI test.

Error: INSTALLATION FAILED: unable to build kubernetes objects from release manifest: resource mapping not found for name: "opentelemetry-collector-o6n8pl90u0" namespace: "opentelemetry-collector-o6n8pl90u0" from "": no matches for kind "VerticalPodAutoscaler" in version "autoscaling.k8s.io/v1"

@boqu
Contributor Author

boqu commented Feb 18, 2026

> I would be strongly against bringing non-standard crds to this chart. Is there currently a way to install VPA separately?

I think we only need to install the CRDs in the GitHub Action in order to run the CI test. If this is not acceptable, I can simply remove the CI test.

@boqu
Contributor Author

boqu commented Feb 18, 2026

I now install the VPA CRD in the GitHub setup action. The CI test passes now.
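A sketch of what such a setup step could look like in a composite action. This is not the step from the PR: the step name, the VPA_CRD_MANIFEST placeholder, and the timeout are assumptions, and the manifest location is deliberately left for the reader to fill in.

```yaml
# Hypothetical step for a composite action such as .github/actions/setup/action.yaml.
# VPA_CRD_MANIFEST is a placeholder; point it at the VPA CRD manifest you want to use.
- name: Install VPA CRDs
  shell: bash
  env:
    VPA_CRD_MANIFEST: path/or/url/to/vpa-crd.yaml
  run: |
    kubectl apply -f "${VPA_CRD_MANIFEST}"
    # Wait until the API server serves the new kind before running chart installs.
    kubectl wait --for=condition=Established \
      crd/verticalpodautoscalers.autoscaling.k8s.io --timeout=60s
```

Waiting for the Established condition avoids a race in which the chart install starts before the VerticalPodAutoscaler kind is discoverable.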


# Vertical Pod Autoscaler - https://kubernetes.io/docs/concepts/workloads/autoscaling/#vertical-pod-autoscaling
verticalPodAutoscaler:
  enabled: false
Member

I have misunderstood how this feature is implemented in Kubernetes. Is this feature not available in a standard 1.27+ k8s install? What additional CRDs does it need?

Contributor Author

No, VPA doesn't ship with Kubernetes by default. It needs to be installed separately. See https://kubernetes.io/docs/concepts/workloads/autoscaling/#scaling-workloads-vertically and https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler. The CRD can be found here.

Member

We've typically been against adding direct support for external CRDs, but seeing as this is a Kubernetes-owned CRD, I'm OK with adding this feature. @open-telemetry/helm-approvers what do you think?

Contributor Author

Can we move this PR forward, or are there any outstanding concerns?

Contributor Author

I'd appreciate it if we could move this PR forward.

Contributor Author

@TylerHelmuth Is there anything I can do to move this PR forward?

Contributor

I would be OK with having native VPA support.

But I didn't understand why extraManifest didn't work.

This error suggests that the VPA CRDs weren't installed on that cluster:

Error: INSTALLATION FAILED: unable to build kubernetes objects from release manifest: resource mapping not found for name: "opentelemetry-collector-o6n8pl90u0" namespace: "opentelemetry-collector-o6n8pl90u0" from "": no matches for kind "VerticalPodAutoscaler" in version "autoscaling.k8s.io/v1"

Contributor Author

My understanding is that, before applying anything, Helm renders the whole release and runs every manifest through the client-side REST mapper to resolve its GVK (Group, Version, Kind) against the API server's discovery data. At that point, the CRD added via extraManifest has not been created yet, so Helm fails to build the VPA object. That also explains why it works if we pre-install the VPA CRD in the CI test. See .github/actions/setup/action.yaml.
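To make the failure concrete, here is a sketch of the kind of object the release has to resolve. It is not the chart's actual rendered output: the object name, target, container name, and bounds are hypothetical, and only the API group, kind, and field layout follow the VPA API.

```yaml
# Hypothetical rendered manifest; names and bounds are placeholders.
apiVersion: autoscaling.k8s.io/v1   # served only when the VPA CRD is installed
kind: VerticalPodAutoscaler
metadata:
  name: example-collector
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-collector
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
      - containerName: opentelemetry-collector
        controlledResources: ["cpu", "memory"]
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: "1"
          memory: 1Gi
```

If the cluster does not yet serve autoscaling.k8s.io/v1, the kind cannot be mapped to a REST resource, which matches the "no matches for kind" error quoted in this thread.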

Member

Ya, you can't install a CRD via the manifest. Managing CRDs in the same Helm chart that installs something relying on those CRDs is really hard (see the kube-stack chart). If you install the CRDs separately, I'd expect the extraManifest to work. We will definitely not be adding the CRDs to this chart, so no matter what you'll need to manage the CRDs elsewhere.

Contributor Author

I probably didn't describe the PR clearly in the first place. The OTel Collector Helm chart won't install the VPA CRDs. Users who enable VPA through the chart need to make sure the VPA CRDs are already installed. I just updated the comment to make that clear.

I also bumped the version to resolve the merge conflicts.

Can we move this PR forward?

@github-actions
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions Bot added the Stale label Mar 18, 2026
@boqu boqu force-pushed the add-support-for-vpa branch from e094768 to a5e0b92 Compare March 21, 2026 02:22
@github-actions github-actions Bot removed the Stale label Mar 21, 2026
@boqu
Contributor Author

boqu commented Mar 30, 2026

Any chance we can move this PR forward?

Bump chart to 0.153.0 to resolve version conflict with upstream (0.152.0)
and re-render examples. Strengthen VPA comment in values.yaml to note
that VPA CRDs and controller must be installed separately — this chart
does not install them.
4 participants